What is IsoForma?

IsoForma is a package for quantifying positional isomers (QPI) in MS2 spectra. Currently, this type of analysis requires several separate tools, which is inconvenient and time-consuming. The goal of this software is to offer all the functionality needed for this analysis in one streamlined package.

Much of the backend functionality is drawn from the pspecterlib package, including generating metadata objects. More information about the backend package can be found here.

IsoForma was built to ingest two main types of data: 1) an MS file (XML-based or ThermoFisher raw) or 2) a list of peak_data objects that can be generated with pspecterlib. If an MS file is provided, automatic MS2 peak detection options are provided. Otherwise, the provided peak data is simply summed together.

Here are the general steps of the IsoForma algorithm and their respective functions:

  1. Select scan numbers: Either manually or with pull_scan_numbers()

  2. Sum peaks: sum_ms2_spectra()

  3. Match experimental and literature fragments for every proteoform: fragments_per_ptm()

  4. Sum isotopes and charge states per fragment per proteoform: sum_isotopes()

  5. Calculate an abundance matrix: abundance_matrix()

  6. Calculate proteoform relative proportions: calculate_proportions()

Steps 3-6 can be run all together with the main pipeline function, isoforma_pipeline().

1. Select scan numbers

To select scan numbers, either use the pull_scan_numbers() function to automatically detect and suggest MS2 peaks, or select them yourself and make pspecterlib peak_data objects out of them. See ?pspecterlib::make_peak_data or ?pspecterlib::get_peak_data.

# Make a list of pspecterlib peak_data objects
PeakDataList <- list(
  readRDS(system.file("extdata", "PeakData_1to1to1_1.RDS", package = "isoforma")),
  readRDS(system.file("extdata", "PeakData_1to1to1_2.RDS", package = "isoforma")),
  readRDS(system.file("extdata", "PeakData_1to1to1_3.RDS", package = "isoforma"))
)

head(PeakDataList[[1]]) %>% knitr::kable()
M/Z Intensity Abundance
151.5681 483.0363 0.0313
151.5682 930.2599 0.0603
151.5683 1144.0471 0.0742
151.5684 1003.7305 0.0651
151.5686 631.3395 0.0409
151.5922 461.2453 0.0299

2. Sum peaks

The peak summing function takes either a scan_metadata object from pspecterlib together with the scan numbers selected by pull_scan_numbers(), or a list of pspecterlib peak_data objects. In both cases it returns one summed peak_data object.

# Sum selected peaks together 
PeaksSum <- sum_ms2_spectra(
  PeakDataList = PeakDataList,
  PPMRound = 5,
  MinimumAbundance = 0.01
)
head(PeaksSum) %>% knitr::kable()
M/Z Intensity Abundance
150.2730 693.4443 0.0153
150.2735 3721.4915 0.0820
150.9515 1706.9498 0.0376
150.9520 3536.4570 0.0779
150.9740 3425.3976 0.0754
150.9745 670.2090 0.0148
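
To make the idea of summing spectra concrete, here is a minimal base R sketch. It is not the sum_ms2_spectra() implementation: the helper name, the decimal rounding used for binning (a simplification of ppm-based rounding), and the definition of abundance as percent of the most intense bin are all assumptions made for illustration, and the toy values are invented.

```r
# Toy spectra with invented values; column names are illustrative only.
spectra <- list(
  data.frame(mz = c(150.27301, 150.95152), intensity = c(300, 1000)),
  data.frame(mz = c(150.27304, 151.20000), intensity = c(400, 5))
)

# Hypothetical helper (not part of isoforma or pspecterlib):
# 1. stack all peaks, 2. bin m/z by rounding to a fixed number of decimals,
# 3. sum intensities per bin, 4. rescale to percent of the largest bin,
# 5. drop bins below a minimum abundance.
sum_spectra_sketch <- function(spectra, digits = 3, min_abundance = 1) {
  all_peaks <- do.call(rbind, spectra)
  all_peaks$mz <- round(all_peaks$mz, digits)
  summed <- aggregate(intensity ~ mz, data = all_peaks, FUN = sum)
  summed$abundance <- 100 * summed$intensity / max(summed$intensity)
  summed[summed$abundance >= min_abundance, , drop = FALSE]
}

summed_sketch <- sum_spectra_sketch(spectra)
print(summed_sketch)
```

In this toy case the two peaks near 150.273 fall into the same bin and are added together, while the weak peak at 151.2 is removed by the abundance filter.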

3. Match experimental and literature fragments for every proteoform

To generate all proteoforms to test, use the pspecterlib::multiple_modifications function. Then, pass that list of sequences to the fragments_per_ptm function. If the isotoping algorithm crashes, consider switching to IsotopeAlgorithm = "isopat". This function will return a list of matched_peak objects from pspecterlib.

# Generate a list of PTMs to test
MultipleMods <- pspecterlib::multiple_modifications(
  Sequence = "LQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG",
  Modification = "6.018427,V(17,26,70)[1]",
  ReturnUnmodified = TRUE
)
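
As a rough illustration of what the Modification string above asks for, the base R sketch below enumerates the candidate placements. It assumes that "V(17,26,70)[1]" means "place exactly one modification on one of the valines at positions 17, 26, or 70"; that reading, and the labels produced here, are assumptions and not the output format of pspecterlib::multiple_modifications.

```r
sequence <- "LQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG"
positions <- c(17, 26, 70)  # candidate sites from the Modification string
n_mods <- 1                 # assumed meaning of the trailing [1]

# Check that every candidate position really is a valine
residues <- substring(sequence, positions, positions)
stopifnot(all(residues == "V"))

# Enumerate every way to place n_mods modifications on the candidate sites
placements <- combn(positions, n_mods, simplify = FALSE)

# Illustrative labels only; the real function returns its own naming
placement_labels <- vapply(
  placements,
  function(p) paste0("V", p, collapse = "&"),
  character(1)
)
print(placement_labels)
```

Under this assumption there are three modified proteoforms, plus the unmodified sequence when ReturnUnmodified = TRUE.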

# Calculate fragments per proteoform
AllFragments <- fragments_per_ptm(
  Sequences = MultipleMods,
  SummedSpectra = PeaksSum,
  PrecursorCharge = 11,
  ActivationMethod = "ETD",
  CorrelationScore = 0, # Here, we don't care about correlation score filtering
  Messages = FALSE
)

head(AllFragments[[2]]) %>% knitr::kable()
| PPM Error | Ion | Z | Isotope | M/Z | M/Z Experimental | M/Z Tolerance | Isotopic Percentage | Intensity Experimental | Correlation Score | Type | General Type | Modifications | Molecular Formula | Position | N Position | Residue | Sequence |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| -9.7607150 | c12 | 6 | M | 225.4787 | 225.4765 | 0.0022548 | 100.00000 | 684.4547 | NA | c | c | | C63H109N15O17 | 12 | 12 | T12 | LQIFVKTLTGKT |
| 0.1350431 | c2 | 1 | M | 259.1765 | 259.1765 | 0.0025918 | 100.00000 | 2941532.0312 | NA | c | c | | C11H21N3O4 | 2 | 2 | Q2 | LQ |
| -8.1864786 | c2 | 1 | M+1 | 260.1851 | 260.1830 | 0.0026019 | 13.80402 | 2555.8947 | NA | c | c | | C11H21N3O4 | 2 | 2 | Q2 | LQ |
| 0.3082109 | z23 | 9 | M | 295.6129 | 295.6130 | 0.0029561 | 68.07372 | 683.7947 | NA | z | z | | C116H197N37O35 | 23 | 54 | R54 | RTLSDYNIQKESTLHLVLRLRGG |
| 6.1756975 | c21 | 7 | M | 334.7684 | 334.7705 | 0.0033477 | 76.52361 | 1072.7388 | NA | c | c | | C106H178N24O34 | 21 | 21 | D21 | LQIFVKTLTGKTITLEVEPSD |
| -0.0671573 | c3 | 1 | M | 372.2605 | 372.2605 | 0.0037226 | 100.00000 | 956499.5312 | NA | c | c | | C17H32N4O5 | 3 | 3 | I3 | LQI |
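
The PPM Error and M/Z Tolerance columns look consistent with the usual parts-per-million definitions, so the sketch below recomputes them for two rows. The helper names are hypothetical, and the 10 ppm tolerance is inferred from the displayed values (0.0022548 is about 10 ppm of 225.4787) rather than taken from the package documentation. Because the m/z values shown are rounded to four decimals, the recomputed errors only approximately match the table.

```r
# Hypothetical helpers; not part of isoforma or pspecterlib.
ppm_error <- function(mz_experimental, mz_theoretical) {
  (mz_experimental - mz_theoretical) / mz_theoretical * 1e6
}
mz_tolerance <- function(mz_theoretical, ppm = 10) {
  mz_theoretical * ppm / 1e6
}

# M/Z (theoretical) and M/Z Experimental for c12 and c21, copied from the table
mz_theoretical  <- c(c12 = 225.4787, c21 = 334.7684)
mz_experimental <- c(c12 = 225.4765, c21 = 334.7705)

errors <- ppm_error(mz_experimental, mz_theoretical)
tolerances <- mz_tolerance(mz_theoretical)

# In this sketch a peak matches when its absolute m/z difference is within tolerance
within_tolerance <- abs(mz_experimental - mz_theoretical) <= tolerances

print(round(errors, 2))
print(round(tolerances, 7))
print(within_tolerance)
```

A negative error means the experimental peak sits below the theoretical m/z, as for c12 here.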

4. Sum isotopes and charge states per fragment per proteoform

This function will return a table of summed intensities per fragment.

IsotopesSum <- sum_isotopes(IsoformaFragments = AllFragments)

head(IsotopesSum) %>% knitr::kable()
Ion Summed Intensity Proteoform
c10 3471537.65 UnmodifiedSequence
c11 1424215.52 UnmodifiedSequence
c12 11049.11 UnmodifiedSequence
c13 128408.18 UnmodifiedSequence
c14 573659.49 UnmodifiedSequence
c15 182678.46 UnmodifiedSequence
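
The arithmetic behind this step can be sketched in base R with the c2 and c3 rows from the fragment table in step 3. This is only an illustration of summing over isotopes and charge states per ion, not the sum_isotopes() implementation, and the column names are simplified.

```r
# Rows copied from the step 3 fragment table (second proteoform)
fragments <- data.frame(
  Ion       = c("c2", "c2", "c3"),
  Isotope   = c("M", "M+1", "M"),
  Z         = c(1, 1, 1),
  Intensity = c(2941532.0312, 2555.8947, 956499.5312)
)

# Sum intensity over all isotopes and charge states of each ion
summed_ions <- aggregate(Intensity ~ Ion, data = fragments, FUN = sum)
names(summed_ions)[2] <- "SummedIntensity"
print(summed_ions)
```

The c2 total, 2941532.0312 + 2555.8947 = 2944087.9259, agrees with the c2 value of 2944087.9 shown in the abundance matrix in step 5. The c3 total here is lower than the matrix value because only the first six fragment rows are displayed, so further c3 isotopes are presumably missing from this toy input.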

5. Calculate an abundance matrix

This function will return an abundance matrix for a selected ion, where each row is a fragment and each column is a proteoform. The values are summed intensities.

# Select your ion group of choice when calculating the abundance matrix
AbunMat <- abundance_matrix(
  SummedIsotopes = IsotopesSum,
  IonGroup = "c"
)

head(AbunMat) %>% knitr::kable()
Ion
c2 2944087.9 2944087.9 2944087.9
c3 957655.9 957655.9 957655.9
c4 665981.4 665981.4 665981.4
c5 635080.4 635080.4 635080.4
c6 405053.6 405053.6 405053.6
c7 1050680.0 1050680.0 1050680.0
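
Conceptually, this step reshapes the long table from step 4 into a fragment-by-proteoform matrix for one ion type. The base R sketch below shows that reshaping on invented values with placeholder proteoform names "A" and "B"; it is an assumption about the general idea, not the abundance_matrix() code.

```r
# Invented long-format input: one row per ion per proteoform
summed_long <- data.frame(
  Ion             = c("c2", "c2", "c20", "c20", "z5"),
  Proteoform      = c("A", "B", "A", "B", "A"),
  SummedIntensity = c(100, 100, 40, 10, 7)
)

# Keep only the selected ion group (here: c ions)
c_ions <- summed_long[grepl("^c", summed_long$Ion), ]

# Rows are fragments, columns are proteoforms, values are summed intensities
abundance_sketch <- tapply(
  c_ions$SummedIntensity,
  list(c_ions$Ion, c_ions$Proteoform),
  sum
)
print(abundance_sketch)
```

The identical values across columns for c2 through c7 in the real output are expected: those fragments cover residues 1-7 and contain none of the candidate sites (V17, V26, V70), so presumably every proteoform matches the same peaks for them.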

6. Calculate proteoform relative proportions

This function returns both a table and a plot.

Proportions <- calculate_proportions(AbundanceMatrix = AbunMat)
## Profiling...
## Profiling...
Proportions[[1]] %>% knitr::kable()
Modification Proportion LowerCI UpperCI
0.2341259 0.1769313 0.3078882
0.3299839 0.2888934 0.3699363
0.4358902 0.3959378 0.4769807
Proportions[[2]]
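
A quick sanity check on the table above, using the printed values directly rather than the Proportions object: the three estimates sum to one, and each estimate lies inside its own confidence interval. The data frame below is retyped from the output and the column names follow the printed header.

```r
# Values copied from the proportions table above
proportions_tbl <- data.frame(
  Proportion = c(0.2341259, 0.3299839, 0.4358902),
  LowerCI    = c(0.1769313, 0.2888934, 0.3959378),
  UpperCI    = c(0.3078882, 0.3699363, 0.4769807)
)

# Relative proportions should add up to (approximately) 1
total <- sum(proportions_tbl$Proportion)
print(total)

# Each point estimate should fall within its interval
inside_ci <- with(proportions_tbl, Proportion >= LowerCI & Proportion <= UpperCI)
print(inside_ci)
```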

3-6: Main Pipeline Function

Steps 3-6 are the same whether scans are pre-selected or chosen with pull_scan_numbers(), so the isoforma_pipeline() function runs them all together.

Accessory Functions

annotated_spectrum_ptms_plot()

Visualize multiple PTM fragment identifications over one plot in a large, interactive plotly display.

annotated_spectrum_ptms_plot(
  SummedSpectra = PeaksSum, 
  IsoformaFragments = AllFragments
)

ptm_heatmap()

PPM errors per fragment and PTM combination can be visualized with this heatmap function.

ptm_heatmap(IsoformaFragments = AllFragments)

write_mgf_simple()

This simple MGF writer function can be used to generate MGF files of peak data for use with external tools.
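
For readers unfamiliar with the format, here is a minimal base R sketch of what a single-spectrum MGF file looks like. It is not the write_mgf_simple() implementation, whose arguments are not shown in this document; the helper name, its parameters, and the title are hypothetical. The two peaks are taken from the summed spectrum in step 2, and PEPMASS is left optional because the precursor m/z is not given here.

```r
# Hypothetical helper: writes one spectrum as a Mascot Generic Format block.
write_mgf_sketch <- function(peaks, path, title = "spectrum_1",
                             pepmass = NULL, charge = NULL) {
  header <- c("BEGIN IONS", paste0("TITLE=", title))
  if (!is.null(pepmass)) header <- c(header, paste0("PEPMASS=", pepmass))
  if (!is.null(charge)) header <- c(header, paste0("CHARGE=", charge, "+"))
  # One "m/z intensity" pair per line
  peak_lines <- sprintf("%.4f %.4f", peaks$mz, peaks$intensity)
  writeLines(c(header, peak_lines, "END IONS"), path)
  invisible(path)
}

peaks <- data.frame(
  mz        = c(150.2730, 150.2735),
  intensity = c(693.4443, 3721.4915)
)

mgf_path <- write_mgf_sketch(peaks, tempfile(fileext = ".mgf"),
                             title = "summed_ms2", charge = 11)
mgf_lines <- readLines(mgf_path)
cat(mgf_lines, sep = "\n")
```

Many downstream tools expect a PEPMASS line, so supply pepmass when the precursor m/z is known.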